REMARKS

Reconsideration and allowance in view of the foregoing amendment and the following
remarks are respectfully requested. Claims 22, 23 and 30 are amended without prejudice or
disclaimer. Applicant reserves the right to pursue broader claims in a continuation application.

Rejection of Claims 22-25, 27, 29-32 and 34

The Office Action rejects claims 22-25, 27, 29-32 and 34 under 35 U.S.C. § 103(a) as
being unpatentable over Ezzat et al. ("Visual Speech Synthesis by Morphing Visemes")
("Ezzat et al.") in view of Jiang et al. ("Visual Speech Analysis ... Mandarin Speech
Training") ("Jiang et al.") in view of Bregler et al. We note that Applicant does
not acquiesce to any broadening of the teachings of Bregler et al. as being admitted prior art. For
example, the specification notes on page 2 that Bregler et al. utilize triphone segments as the a
priori units of video, thus causing a lack of natural flow. Thus, Applicant does not admit that
the breadth of Bregler et al. extends beyond the teachings of using triphone segments as the a
priori units of video.

Applicant first traverses the combination of Ezzat et al., Jiang et al. and Bregler et al. For
example, the Office Action states that it would be obvious to combine Ezzat et al. with Jiang et al.,
but gives no analysis of why other than to state that Jiang et al. teach an advantage of obtaining
feature vectors to help children improve speech pronunciations. The Examiner has the burden of
establishing a prima facie case of obviousness and Applicant submits that such a case has not
been made. The reason one of skill in the art would not be motivated to combine these
references is that Ezzat et al.'s disclosure relates to the process of providing speech synthesis by
morphing visemes. As stated in the Abstract, visemes are a small set of images spanning a large
range of mouth shapes. The viseme database is created by recording a human subject in a
manner specifically designed to elicit one instantiation of each viseme. One of skill in the art
will understand the technical process in generating a viseme database and the process of
morphing those visemes to create a smooth animation.

Jiang et al., on the other hand, discloses an analysis system for teaching Mandarin
Chinese to oral deaf children. The portion of Jiang et al. cited in the Office Action, "at each
frame, region of interest is identified and key feature is extracted," clearly is in the context
of a camera that video tapes a child speaking such that the child can be video taped
and the language analyzed. Later, a talking head (which is a representation of the child) is
presented to the child to help with correcting errors. The talking head recreated here is a
representation of the child and made from the child's own images. There is no discussion of
visemes or suggestion that the cloned talking head is presented using a viseme process.
Therefore, Applicant submits that one of skill in the art would recognize the technical differences
and would not be motivated to use Jiang et al.'s feature vectors because there is no hint that they
can apply to visemes.

Next, Applicant submits that claims 22, 23 and 30 as amended recite limitations that
overcome the prior art even if combined. For example, claim 22 recites a method for the
synthesis of photo-realistic animation of an object. The method includes the step of selecting
candidate image samples utilizing the target feature vector to generate a photo-realistic animation
of the object, wherein generating the photo-realistic animation of the object uses an
audio/video unit selection process in which a longest possible candidate image sample is
selected. There is no teaching within the prior art of record regarding a selection process in
which the longest possible candidate image sample is selected. In Ezzat et al., for example, they
teach that the visual corpus is digitized at 30 fps. Ezzat et al. has the need for dealing with many
intermediate frames that lie between the chosen viseme images. Accordingly, Applicant




Docket No.: 2000-0042-CON
Application/Control Number: 10/662,550
Art Unit: 2628



submits that claim 22, claim 23 and its dependent claims and claim 30 and its dependent claims
are patentable over the cited references for the several reasons set forth above.

Rejection of Claims 28 and 35



The Office Action rejects claims 28 and 35 under 35 U.S.C. § 103(a) as being
unpatentable over Ezzat et al. in view of Jiang et al. in further view of Bregler et al. in further
view of Brand ("Voice Puppetry") ("Brand"). Applicant traverses this rejection and submits that
Brand fails to teach the limitations of claims 28 and 35.
The Office Action equates page 25, column 1's discussion in Brand regarding the Viterbi
sequence with the step in claim 28 of selecting for each frame a number of candidate image
samples from the first database based on the target feature vector. The parent claim 22 requires
the target feature vector to have a visual and a non-visual aspect to it. This feature is not taught
in Brand. Further, Brand fails to teach the step of calculating a concatenation cost from a
combination of visual features from the second database and object characteristics from the third
database. Therefore, even if combined, Brand fails to teach each limitation of claim 28 as well
as claim 35.

Furthermore, Brand teaches away from its combination with the other references. In
fact, Brand expressly distances itself from the Bregler et al. reference. Bregler et al. is listed as
citation [7] in Brand. On page 22, top of col. 1, Brand states that nearly all lip-syncing
systems are based on an intermediate phonemic representation whether obtained by hand, text or
via speech recognition (citing Bregler et al. and others). Brand cites Bregler et al. again in that
paragraph noting that

"[7] works by re-ordering existing video frames rather than by generating animations, it
deserves mention because it partially models vocal (but not facial) co-articulation with
triphones - phonemes plus one unit of left and right context. ... constrain the solution
and no video will provide an adequate stock of ... None of these methods address the
actual dynamics of the face."

Accordingly, it is clear that Brand makes special mention of the Bregler et al. article in
the background section to highlight it as a reference which fails to "address the actual dynamics
of the face." In this respect, the Bregler et al. article is cast as teaching undesirable techniques to
performing animation. In this regard, Applicant respectfully submits that there cannot be any
motivation or suggestion to combine these two references where there are express teachings
away from such combination.

Similarly, Brand discloses in the same part of page 22 that references that use "visemic
tokens" are also deficient in that they fail to address the actual dynamics of the face.
Accordingly, one of skill in the art would recognize that Brand teaches away from using the
viseme approach taught by Ezzat et al. As another example of how Brand distances his
teachings from a viseme-based approach, he states "Considerable information can be lost when
discretizing to phonemic or visemic representations." Col. 1, page 22. Accordingly, Brand also
cannot be combined with Ezzat et al. because of his express teachings away from such
combination.

Based on the foregoing reasons, Applicant submits that claims 28 and 35 are patentable
and in condition for allowance.
CONCLUSION

Having addressed all rejections and objections, Applicant respectfully submits that the
subject application is in condition for allowance and a Notice to that effect is earnestly solicited.
If necessary, the Commissioner for Patents is authorized to charge or credit the Isaacson, Irving,
Stelacone & Prass, LLC, Account No. 502960 for any deficiency or overpayment.

Respectfully submitted,

Date: August 18, 2006

By: /Thomas M. Isaacson/
Thomas M. Isaacson
Attorney for Applicant
Reg. No. 44,166
Phone: 410-286-9405
Fax No.: 410-510-1433

Correspondence Address:
Thomas A. Restaino
Reg. No. 33,444
AT&T Corp.
Room 2A-207
One AT&T Way
Bedminster, NJ 07921